Search results for "Word-sense induction"


A practical solution to the problem of automatic word sense induction

2004

Recent studies in word sense induction are based on clustering global co-occurrence vectors, i.e. vectors that reflect the overall behavior of a word in a corpus. If a word is semantically ambiguous, this means that these vectors are mixtures of all its senses. Inducing a word's senses therefore involves the difficult problem of recovering the sense vectors from the mixtures. In this paper we argue that the demixing problem can be avoided since the contextual behavior of the senses is directly observable in the form of the local contexts of a word. From human disambiguation performance we know that the context of a word is usually sufficient to determine its sense. Based on this observation…

Computer science; business.industry; Word-sense induction; Computer Science::Computation and Language (Computational Linguistics and Natural Language and Speech Processing); Context (language use); Artificial intelligence; Cluster analysis; computer.software_genre; business; computer; Word (computer architecture); Natural language processing; SemEval
Proceedings of the ACL 2004 on Interactive poster and demonstration sessions

Discovering the Senses of an Ambiguous Word by Clustering its Local Contexts

2005

As has been shown recently, it is possible to automatically discover the senses of an ambiguous word by statistically analyzing its contextual behavior in a large text corpus. However, this kind of research is still at an early stage. The results need to be improved and there is considerable disagreement on methodological issues. For example, although most researchers use clustering approaches for word sense induction, it is not clear what statistical features the clustering should be based on. Whereas so far most researchers cluster global co-occurrence vectors that reflect the overall behavior of a word in a corpus, in this paper we argue that it is more appropriate to use local context v…

Text corpus; business.industry; Computer science; Context (language use); computer.software_genre; Word sense; Word-sense induction; Artificial intelligence; business; Cluster analysis; computer; Natural language processing; Word (computer architecture); Strengths and weaknesses

A practical solution to the problem of automatic part-of-speech induction from text

2005

The problem of part-of-speech induction from text involves two aspects: Firstly, a set of word classes is to be derived automatically. Secondly, each word of a vocabulary is to be assigned to one or several of these word classes. In this paper we present a method that solves both problems with good accuracy. Our approach adopts a mixture of statistical methods that have been successfully applied in word sense induction. Its main advantage over previous attempts is that it reduces the syntactic space to only the most important dimensions, thereby almost eliminating the otherwise omnipresent problem of data sparseness.

Vocabulary; business.industry; Computer science; media_common.quotation_subject; Speech recognition; Space (commercial competition); Part of speech; computer.software_genre; Syntax; Set (abstract data type); Word-sense induction; Artificial intelligence; business; computer; Natural language processing; Word (computer architecture); media_common
Proceedings of the ACL 2005 on Interactive poster and demonstration sessions - ACL '05